Supporting Data Integrity

Let's understand the downside of using the EAV database design in detail.

We'll cover the following

We can’t make mandatory attributes
You can’t use SQL data types
Enforced referential integrity
You can’t make up attribute names

When we use EAV, we sacrifice many advantages that a conventional database design would have given us.

We can’t make mandatory attributes#

Returning to the example stated above, to should also require that the date_reported attribute has a value. In a conventional database design, it would be simple to enforce a mandatory column by declaring the column NOT NULL.

In the EAV design, each attribute corresponds to a row in the IssueAttributes table, not a column. You would need a constraint that checks that a row exists for each issue_id value, and the row must have the string date_reported in its attr_name column.

However, SQL doesn’t support a constraint that can do this. So, you must write the application code to enforce it. If you do find a bug with no reported date, should you add a value for this attribute? What value should you give it? If you make a guess or use some default value for a missing attribute, how does that affect the accuracy of your boss’s reports?

You can’t use SQL data types#

Your boss tells you that they are having trouble running their report because people have entered dates in different formats or sometimes even strings that aren’t dates. In a conventional database, you can prevent this if you declare the column with the DATE data type.

Enter to Rename, Shift+Enter to Preview

Inserting invalid data in the field (after changing the data type to DATE)

Try it yourself in the following playground.

Enter to Rename, Shift+Enter to Preview

Retrieving data after inserting invalid data in the field

In the EAV design, the data type of the IssueAttributes.attr_value column is typically a string to accommodate all possible attributes in a single column. So, it has no way of rejecting invalid data.

Enter to Rename, Shift+Enter to Preview

Inserting invalid data in the attr_value in EAV design

Try it yourself in the following playground.

Enter to Rename, Shift+Enter to Preview

Inserting invalid data in the attr_value in EAV design

Some people try to extend the EAV design by defining a separate attr_value column for each SQL data type, leaving null in the unused columns. Doing so would allow you to use data types but it would make queries even worse:

Enter to Rename, Shift+Enter to Preview

Defining a separate attr_value column for each SQL data type

You would need to add even more columns to support user-defined data types or domains.

Enforced referential integrity#

In a conventional database, we can restrict the range of some attributes by defining a foreign key to a lookup table. For example, the status attribute of a bug or issue should be one of a small list of values stored in the BugStatus table.

Enter to Rename, Shift+Enter to Preview

Creating a table having status as a foreign key

In the EAV design, we can’t apply this kind of constraint on the attr_value column. A referential integrity constraint applies to every row in the table.

If we define this constraint, it will force every attribute to match a value in BugStatus, not just the status attribute.

Enter to Rename, Shift+Enter to Preview

Inserting data in IssueAttributes table

Let’s try it in the following playground. We will add the given code above and see how it works.

Enter to Rename, Shift+Enter to Preview

Creating table having status as foreign key and inserting data

We will see an error, which occurs as a result of using a column ofBugStatus as a foreign key. It would need to have the same data available in the column status of the BugStatus table to be entered in the attr_value column.

You can’t make up attribute names#

Your boss’ reports are still not reliable. You find that attributes are not being named consistently. One bug uses an attribute named by the string date_ reported, but another bug names the attribute by the string report_date. Both are clearly intended to represent the same information.

How would you count the number of bugs per date?

Run the code in the following playground.

Enter to Rename, Shift+Enter to Preview

Counting the bugs per date

How would you know if a given bug hasn’t stored an attribute by yet another name? How would you know if a given bug has stored a given attribute twice by two different names? How can you prevent such mistakes?

One remedy may be to declare a foreign key on the attr_name column that links to a lookup table that contains your approved attribute names. However, this doesn’t support attributes you define on the fly for each entity, even though that’s a common practice when using the EAV design.

Antipattern: Use a Generic Attribute Table

Solution: Model the Subtypes Using Inheritance

Mark as Completed

Report an Issue

Introduction

Logical Antipattern - Jaywalking

Logical Antipattern - Naive Trees

Logical Antipattern - ID Required

Logical Antipattern - Keyless Entry

Assessment: Logical Antipatterns - Part 1

Logical Antipattern - Entity-Attribute-Value

Logical Antipattern - Polymorphic Associations

Logical Antipattern - Multicolumn Attributes

Logical Antipattern - Metadata Tribbles

Assessment: Logical Antipatterns - Part 2

Physical Antipattern - Rounding Errors

Physical Antipattern - 31 Flavors

Physical Antipattern - Phantom Files

Physical Antipattern - Index Shotgun

Assessment: Physical Antipatterns

Query Antipattern - Fear of the Unknown

Query Antipattern - Ambiguous Groups

Query Antipattern - Random Selection

Query Antipattern - Poor Man’s Search Engine

Query Antipattern - Spaghetti Query

Query Antipattern - Implicit Columns

Assessment: Query Antipattern

Application Development Antipattern - Readable Passwords

Application Development Antipattern - SQL Injection

Application Development Antipattern - Pseudokey Neat-Freak

Application Development Antipattern - See No Evil

Application Development Antipattern - Diplomatic Immunity

Application Development Antipattern - Magic Beans

Assessment: Application Development Antipattern

Conclusion

Supporting Data Integrity

We can’t make mandatory attributes#

You can’t use SQL data types#

Enforced referential integrity#

You can’t make up attribute names#